Learning Word Meta-Embeddings by Using Ensembles of Embedding Sets
نویسندگان
چکیده
Word embeddings – distributed representations of words – in deep learning are beneficial for many tasks in natural language processing (NLP). However, different embedding sets vary greatly in quality and characteristics of the captured semantics. Instead of relying on a more advanced algorithm for embedding learning, this paper proposes an ensemble approach of combining different public embedding sets with the aim of learning meta-embeddings. Experiments on word similarity and analogy tasks and on part-of-speech tagging show better performance of metaembeddings compared to individual embedding sets. One advantage of meta-embeddings is the increased vocabulary coverage. We will release our meta-embeddings publicly.
منابع مشابه
Learning Word Meta-Embeddings
Word embeddings – distributed representations of words – in deep learning are beneficial for many tasks in NLP. However, different embedding sets vary greatly in quality and characteristics of the captured information. Instead of relying on a more advanced algorithm for embedding learning, this paper proposes an ensemble approach of combining different public embedding sets with the aim of lear...
متن کاملThink Globally, Embed Locally - Locally Linear Meta-embedding of Words
Distributed word embeddings have shown superior performances in numerous Natural Language Processing (NLP) tasks. However, their performances vary significantly across different tasks, implying that the word embeddings learnt by those methods capture complementary aspects of lexical semantics. Therefore, we believe that it is important to combine the existing word embeddings to produce more acc...
متن کاملLearning linear transformations between counting-based and prediction-based word embeddings
Despite the growing interest in prediction-based word embedding learning methods, it remains unclear as to how the vector spaces learnt by the prediction-based methods differ from that of the counting-based methods, or whether one can be transformed into the other. To study the relationship between counting-based and prediction-based embeddings, we propose a method for learning a linear transfo...
متن کاملPSDVec: A toolbox for incremental and scalable word embedding
PSDVec is a Python/Perl toolbox that learns word embeddings, i.e. the mapping of words in a natural language to continuous vectors which encode the semantic/syntactic regularities between the words. PSDVec implements a word embedding learning method based on a weighted low-rank positive semidefinite approximation. To scale up the learning process, we implement a blockwise online learning algori...
متن کاملImproved Answer Selection with Pre-Trained Word Embeddings
is paper evaluates existing and newly proposed answer selection methods based on pre-trained word embeddings. Word embeddings are highly effective in various natural language processing tasks and their integration into traditional information retrieval (IR) systems allows for the capture of semantic relatedness between questions and answers. Empirical results on three publicly available data s...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- CoRR
دوره abs/1508.04257 شماره
صفحات -
تاریخ انتشار 2015